Hierarchical Permutation Complexity for Word Order Evaluation

Authors

  • Milos Stanojevic
  • Khalil Sima'an
Abstract

Existing approaches for evaluating word order in machine translation use metrics computed directly over a permutation of word positions in the system output relative to a reference translation. However, every permutation factorizes into a permutation tree (PET) built from primal permutations, i.e., atomic units that do not factorize any further. In this paper we explore the idea that permutations factorizing into (on average) shorter primal permutations should also represent simpler ordering. Consequently, we contribute Permutation Complexity, a class of metrics over PETs and their extension to forests, and define tight metrics, a sub-class of metrics implementing this idea. We then define example tight metrics and test them empirically in word order evaluation. Experiments on the WMT13 data sets for ten language pairs show that a tight metric outperforms the baselines more often than not.
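
To make the factorization idea concrete, the following is a minimal Python sketch of how a permutation can be broken into one permutation tree whose nodes are labelled with primal permutations. It is an illustration only, not the authors' implementation: the names factorize, is_block and operator_lengths are hypothetical, the code returns a single tree rather than the full forest, and the list of operator lengths is just a raw statistic, not one of the tight metrics defined in the paper.

```python
def is_block(values):
    """True if the values form a contiguous integer range (in any order)."""
    return max(values) - min(values) + 1 == len(values)


def factorize(perm):
    """Return one permutation tree for perm.

    A leaf is an int; an inner node is (operator, children), where the
    operator is the relative order of the child blocks (1-based ranks).
    """
    n = len(perm)
    if n == 1:
        return perm[0]
    # First try a binary split into a prefix and a suffix that are both blocks.
    for i in range(1, n):
        if is_block(perm[:i]) and is_block(perm[i:]):
            blocks = [perm[:i], perm[i:]]
            break
    else:
        # No binary split exists. Assumption: the longest proper block
        # starting at each position then gives the children of a primal
        # operator of length >= 4.
        blocks, start = [], 0
        while start < n:
            end = max(e for e in range(start + 1, n + 1)
                      if e - start < n and is_block(perm[start:e]))
            blocks.append(perm[start:end])
            start = end
    # Rank the blocks by their smallest value to obtain the operator.
    order = sorted(range(len(blocks)), key=lambda j: min(blocks[j]))
    operator = [0] * len(blocks)
    for rank, j in enumerate(order, 1):
        operator[j] = rank
    return (tuple(operator), [factorize(b) for b in blocks])


def operator_lengths(tree):
    """Collect the length of every operator in the tree, in preorder."""
    if isinstance(tree, int):
        return []
    operator, children = tree
    lengths = [len(operator)]
    for child in children:
        lengths.extend(operator_lengths(child))
    return lengths


print(operator_lengths(factorize([1, 2, 3, 4])))           # only binary operators
print(operator_lengths(factorize([2, 4, 1, 3])))           # one operator of length 4
print(operator_lengths(factorize([5, 7, 4, 6, 3, 1, 2])))  # mixed
```

Under the idea stated in the abstract, a permutation such as 1 2 3 4, which factorizes into binary operators only, would count as simpler than 2 4 1 3, which is itself primal and cannot be broken down further.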

Similar resources

A Neurally Plausible Encoding of Word Order Information into a Semantic Vector Space

Distributed models of lexical semantics increasingly incorporate information about word order. One influential method for encoding this information into high-dimensional spaces uses convolution to bind together vectors to form representations of numerous n-grams that a target word is a part of. The computational complexity of this method has led to the development of an alternative that uses ra...

Words over an ordered alphabet and suffix permutations

Given an ordered alphabet and a permutation, according to the lexicographic order, on the set of suffixes of a word w, we present in this article a linear time and space method to determine whether a word w′ has the same permutation on its suffixes. Using this method, we are then also able to build the class of all the words having the same permutation on their suffixes, first of all the smalle...

A Discriminative Syntactic Model for Source Permutation via Tree Transduction

A major challenge in statistical machine translation is mitigating the word order differences between source and target strings. While reordering and lexical translation choices are often conducted in tandem, source string permutation prior to translation is attractive for studying reordering using hierarchical and syntactic structure. This work contributes an approach for learning source strin...

Permutation complexity of the fixed points of some uniform binary morphisms

The notion of an infinite permutation was introduced in [1], which investigated the periodic properties and low complexity of permutations. By analogy with the subword complexity of infinite words, we can define the factor complexity of a permutation as the number of distinct subpermutations of a given length. The notion of a permutation generated by an infinite...

Ergodic Infinite Permutations of Minimal Complexity

An infinite permutation can be defined as a linear ordering of the set of natural numbers. As with infinite words, the complexity p(n) of an infinite permutation is defined as a function counting the number of its factors of length n. For infinite words, a classical result of Morse and Hedlund, 1940, states that if the complexity of an infinite word satisfies p(n) ≤ n for some n, then the wo...

Publication date: 2016